Fast and accurate estimation of the covariance between pairwise maximum likelihood distances

نویسندگان

  • Manuel Gil
  • Maria Anisimova
چکیده

Pairwise evolutionary distances are a model-based summary statistic for a set of molecular sequences. They represent the leaf-to-leaf path lengths of the underlying phylogenetic tree. Estimates of pairwise distances with overlapping paths covary because of shared mutation events. It is desirable to take these covariance structure into account to increase precision in any process that compares or combines distances. This paper introduces a fast estimator for the covariance of two pairwise maximum likelihood distances, estimated under general Markov models. The estimator is based on a conjecture (going back to Nei & Jin, 1989) which links the covariance to path lengths. It is proven here under a simple symmetric substitution model. A simulation shows that the estimator outperforms previously published ones in terms of the mean squared error.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Estimation of Parameters for an Extended Generalized Half Logistic Distribution Based on Complete and Censored Data

This paper considers an Extended Generalized Half Logistic distribution. We derive some properties of this distribution and then we discuss estimation of the distribution parameters by the methods of moments, maximum likelihood and the new method of minimum spacing distance estimator based on complete data. Also, maximum likelihood equations for estimating the parameters based on Type-I and Typ...

متن کامل

Step change point estimation in the multivariate-attribute process variability using artificial neural networks and maximum likelihood estimation

In some statistical process control applications, the combination of both variable and attribute quality characteristics which are correlated represents the quality of the product or the process. In such processes, identification the time of manifesting the out-of-control states can help the quality engineers to eliminate the assignable causes through proper corrective actions. In this paper, f...

متن کامل

Covariance Estimation for High Dimensional Data Vectors Using the Sparse Matrix Transform

Covariance estimation for high dimensional vectors is a classically difficult problem in statistical analysis and machine learning. In this paper, we propose a maximum likelihood (ML) approach to covariance estimation, which employs a novel sparsity constraint. More specifically, the covariance is constrained to have an eigen decomposition which can be represented as a sparse matrix transform (...

متن کامل

Factors affecting the errors in the estimation of evolutionary distances between sequences.

Phylogenetic methods that use matrices of pairwise distances between sequences (e.g., neighbor joining) will only give accurate results when the initial estimates of the pairwise distances are accurate. For many different models of sequence evolution, analytical formulae are known that give estimates of the distance between two sequences as a function of the observed numbers of substitutions of...

متن کامل

Maximum likelihood estimates of pairwise rearrangement distances.

Accurate estimation of evolutionary distances between taxa is important for many phylogenetic reconstruction methods. Distances can be estimated using a range of different evolutionary models, from single nucleotide polymorphisms to large-scale genome rearrangements. Corresponding corrections for genome rearrangement distances fall into 3 categories: Empirical computational studies, Bayesian/MC...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره 2  شماره 

صفحات  -

تاریخ انتشار 2014